Abstract. Sun and the CERT recommend for secure Java development
that partially initialized objects must not be accessible. The CERT
considers the severity of the risks taken by not following this recommendation
as high. The solution currently used to enforce object initialization is
to implement a coding pattern proposed by Sun, which is not formally
checked. We propose a modular type system to formally specify the
initialization policy of libraries or programs and a type checker to statically
check at load time that all loaded classes respect the policy. This makes it
possible to prove the absence of bugs which have allowed some famous privilege
escalations in Java. Our experimental results show that our safe default
policy proves 91% of the classes of java.lang, java.security
and javax.security safe without any annotation, and by adding 57
simple annotations we proved all classes but four safe. The type system
and its soundness theorem have been formalized and machine checked
using Coq.

1 Introduction

The initialization of an information system is usually a critical phase where
essential defense mechanisms are being installed and a coherent state is being
set up. In object-oriented software, granting access to partially initialized objects
is consequently a delicate operation that should be avoided or at least closely
monitored. Indeed, the CERT recommendation for secure Java development [2]
clearly requires that partially initialized objects not be accessible (guideline
OBJ04-J). The CERT has assessed the risk if this recommendation is not followed
and has considered the severity as high and the likelihood as probable. They
consider this recommendation as a first priority on a scale of three levels.

The Java language and the Java Byte Code Verifier (BCV) enforce some
properties on object initialization, e.g. about the order in which constructors of
an object may be executed, but they do not directly enforce the CERT
recommendation. Instead, Sun provides a guideline that enforces the recommendation.
Failing to apply this guideline may, however, silently lead to security breaches.
In fact, a famous attack [3] used a partially initialized class loader for privilege
elevation.

We propose a twofold solution: (i) a modular type system which expresses
the initialization policy of a library or program, i.e. which methods may
access partially initialized objects and which may not; and (ii) a type checker,
which can be integrated into the BCV, to statically check the program at load
time. To validate our approach, we have formalized our type system, machine
checked its soundness proof using the Coq proof assistant, and experimentally
validated our solution on a large number of classes from Sun's Java Runtime
Environment (JRE).

Section 3 overviews object initialization in Java and its impact on security.
Section 4 then informally presents our type system, which is then formally
described in Section 5. Section 6 finally presents the experimental results we
obtained on Sun's JRE.

2 Related Work. 

Object initialization has been studied from different points of view. Freund and
Mitchell have proposed a type system that formalizes and enforces the
initialization properties ensured by the BCV, which are not sufficient to ensure that no
partially initialized object is accessed. Unlike local variables, instance fields have
a default value (null, false or 0) which may then be replaced by the program.
The challenge is then to check that the default value has been replaced before
the first access to the field (e.g. to ensure that all field reads return a non-null
value). This has been studied in its general form by Fähndrich and Xia [3],
and Qi and Myers [5]. Those works focus on enforcing invariants on fields
and finely track the different fields of an object. They also try to follow the
objects after their construction to have more information on initialized fields.
This is overkill in our context. Unkel and Lam studied another property of
object initialization: stationary fields [12]. A field may be stationary if all its
reads return the same value. Their analysis also tracks fields of objects and not
the different initialization states of an object. In contrast to our analysis, they stop
tracking any object stored into the heap.

Other works have targeted the order in which methods are called. This has been
studied in the context of rare events (e.g. to detect anomalies, including
intrusions). We refer the interested reader to the survey of Chandola et al. [3]. These
works are mainly interested in the order in which methods are called, not in
the initialization status of arguments. While we guarantee that a method taking
a fully initialized receiver is called after its constructor, this policy cannot be
locally expressed as an order on method calls, because the methods (constructors)
which need to be called on an object to initialize it depend on the dynamic type
of the object.

3 Context Overview 

Fig. 1 is an extract of class ClassLoader of Sun's JRE as it was before 1997.
The security policy which needs to be ensured is that resolveClass, a
security-sensitive method, may be called only if the security check at line 5 has succeeded.
To ensure this security property, this code relies on the properties enforced on
object initialization by the BCV.



    1 public abstract class ClassLoader {
    2   private ClassLoader parent;
    3   protected ClassLoader() {
    4     SecurityManager sm = System.getSecurityManager();
    5     if (sm != null) { sm.checkCreateClassLoader(); }
    6     this.parent = ClassLoader.getSystemClassLoader();
    7   }
    8   protected final native void resolveClass(Class c);
    9 }

Fig. 1. Extract of the ClassLoader of Sun's JRE



Standard Java Object Construction. In Java, objects are initialized by 
calling a class-specific constructor which is supposed to establish an invariant on 
the newly created object. The BCV enforces two properties related to these con- 
structors. These two properties are necessary but, as we shall see, not completely 
sufficient to avoid security problems due to object initialization. 

Property 1. Before accessing an object, (i) a constructor of its dynamic type has
been called and (ii) each constructor either calls another constructor of the same
class or a constructor of the super-class on the object under construction, except
for java.lang.Object which has no super-class.

This implies that at least one constructor of C and of each super-class of C is
called: it is not possible to bypass a level of constructor. To deal with
exceptional behaviour during object construction, the BCV enforces another property
(concisely described in The Java Language Specification [8], Section 12.5, or
implied by the type system described in the JSR202 [1]).

Property 2. If one constructor finishes abruptly, then the whole construction of 
the object finishes abruptly. 

Thus, if the construction of an object finishes normally, then all constructors
called on this object have finished normally. Failure to implement this verification
properly led to a famous attack [4], which exploited the fact that if code such as
try { super(); } catch (Throwable e) { } in a constructor is not rejected by
the BCV, then malicious classes can create security-critical classes such as class
loaders.

Attack on the class loader and the patch from Sun. However, even with
these two properties enforced, it is not guaranteed that uninitialized objects
cannot be used. In Fig. 1, if the check fails, the method checkCreateClassLoader
throws an exception and therefore terminates the construction of the object,
but the garbage collector then calls a finalize() method, which is an instance
method and has the object to be collected as receiver (cf. Section 12.6 of [8]).

An attacker could code another class that extends ClassLoader and has a
finalize() method. If run in a rights-restricted context, e.g. an applet, the
constructor of ClassLoader fails and the garbage collector then calls the attacker's
finalize method. The attacker can therefore call the resolveClass method
on it, bypassing the security check in the constructor and breaking the security
of Java.

    public abstract class ClassLoader {
      private volatile boolean initialized;
      private ClassLoader parent;
      protected ClassLoader() {
        SecurityManager sm = System.getSecurityManager();
        if (sm != null) { sm.checkCreateClassLoader(); }
        this.parent = ClassLoader.getSystemClassLoader();
        this.initialized = true; }
      private void check() {
        if (!initialized) {
          throw new SecurityException(
            "ClassLoader object not initialized"); } }
      protected final void resolveClass(Class c) {
        this.check();
        this.resolveClass0(c); }
      private native void resolveClass0(Class c);
    }

Fig. 2. Extract of the ClassLoader of Sun's JRE
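The finalizer attack described in this section can be reproduced in miniature. The
following is a minimal sketch, not the original exploit: Guarded is a hypothetical
stand-in for a class whose only security check sits in its constructor, and Attacker
and leaked are illustrative names. Whether and when the finalizer runs depends on
the JVM and its garbage collector, and finalization is deprecated in recent JDKs.

```java
// Sketch only: Guarded, Attacker and leaked are hypothetical names.
class FinalizerLeakSketch {
    /** Stand-in for a security-sensitive class whose only check is in its constructor. */
    static class Guarded {
        Guarded(boolean allowed) {
            if (!allowed) throw new SecurityException("creation denied");
        }
        final String sensitive() { return "sensitive method reached"; }
    }

    /** Subclass that captures the partially initialized receiver in its finalizer. */
    @SuppressWarnings({"deprecation", "removal"})
    static class Attacker extends Guarded {
        static volatile Guarded leaked;
        Attacker() { super(false); }                 // the security check fails here
        @Override protected void finalize() { leaked = this; }
    }

    /** Returns true if constructing an Attacker is rejected with a SecurityException. */
    static boolean constructionFails() {
        try { new Attacker(); return false; }
        catch (SecurityException e) { return true; }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("constructor rejected: " + constructionFails());
        // Best effort: ask the JVM to collect the rejected object; nothing is guaranteed.
        for (int i = 0; i < 50 && Attacker.leaked == null; i++) {
            System.gc();
            Thread.sleep(20);
        }
        if (Attacker.leaked != null) {
            System.out.println(Attacker.leaked.sensitive());
        } else {
            System.out.println("finalizer did not run in this execution");
        }
    }
}
```

When the finalizer does run, the sensitive method is reached on an object whose
constructor never completed, which is exactly the situation the two BCV properties
fail to exclude.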

The initialization policy enforced by the BCV is in fact too weak: when a method
is called on an object, there is no guarantee that the construction of the object
has completed successfully. An ad-hoc solution to this problem is proposed by
Sun [11] in its Guideline 4-3, Defend against partially initialized instances of
non-final classes: adding a special Boolean field to each class for which the
developer wants to ensure it has been sufficiently initialized. This field, set to
false by default, should be private and should be set to true at the end of
the constructor. Then, every method that relies on the invariant established by
the constructor must test whether this field is set to true and fail otherwise. If
initialized is true, the construction of the object up to the initialization of
initialized has succeeded. Checking that initialized is true thus ensures
that sensitive code is only executed on objects that have been initialized up to
the constructor of the current class. Fig. 2 shows the same extract as in Fig. 1
but with the needed instrumentation (this is the current implementation as of
JRE 1.6.0_16).

Although there are some exceptions and some methods are designed to access
partially initialized objects (for example to initialize the object), most methods
should not access partially initialized objects. Following the remediation solution
proposed in the CERT's recommendation or Sun's guideline 4-3, a field should
be added to almost every class and most methods should start by checking
this field. This is resource consuming and error prone because it relies on the
programmer to keep track of what the semantic invariant is, without providing
adequate automated software development tools. It may therefore lead not to
functional bugs but to security breaches, which are harder to detect. In spite of
being known since 1997, this pattern is not always correctly applied in all places
where it should be. This has led to security breaches, see e.g. the Secunia
Advisory SA10056 [10].

4 The right way: a type system 

We propose a twofold solution: first, a way to specify the security policy which 
is simple and modular, yet more expressive than a single Boolean field; second, 
a modular type checker, which could be integrated into the BCV, to check that 
the whole program respects the policy. 

4.1 Specifying an Initialization Policy with Annotations. 

We rely on Java annotations and on one instruction to specify our initialization 
policy. We herein give the grammar of the annotations we use. 

V_ANNOT ::= @Init | @Raw | @Raw(CLASS)
R_ANNOT ::= @Pre(V_ANNOT) | @Post(V_ANNOT)

We introduce two main annotations: @Init, which specifies that a reference can
only point to a fully initialized object or the null constant, and @Raw, which
specifies that a reference may point to a partially initialized object. A third
annotation, @Raw(CLASS), makes it possible to specify that the object may be partially
initialized but that all constructors up to and including the constructor of CLASS
must have been fully executed. E.g., when one checks that initialized contains
true in ClassLoader.resolveClass, one checks that the receiver has the type
@Raw(ClassLoader). The annotations produced by the V_ANNOT rule are used
for fields, method arguments and return values. In the Java language, instance
methods implicitly take another argument: a receiver, reachable through
variable this. We introduce a @Pre annotation to specify the type of the receiver at
the beginning of the method. Some methods, usually called from constructors,
are meant to initialize their receiver. We have therefore added the possibility to
express this by adding a @Post annotation for the type of the receiver at the
end of the method. These annotations take as argument an initialization level
produced by the rule V_ANNOT.

Fig. 3 shows an example of @Raw annotations. Class Ex1A has an instance
field f, a constructor and a getter getF. This getter requires the object to be
initialized at least up to Ex1A as it accesses a field initialized in its constructor.
The constructor of Ex1B uses this getter, but the object is not yet completely
initialized: it has the type Raw(Ex1A) as it has finished the constructor of Ex1A
but not yet the constructor of Ex1B. If the getter had been annotated with @Init
it would not have been possible to use it in the constructor of Ex1B.

Another part of the security policy is the SetInit instruction, which mimics
the instruction this.initialized = true in Sun's guideline. It is implicitly
put at the end of every constructor but it can be explicitly placed earlier. It
declares that the current object has completed its initialization up to the current
class. Note that the object is not yet considered fully initialized as the constructor
might have been called as a parent constructor from a subclass. The instruction can
be used, as in Fig. 4, in a constructor after checking some properties and before
calling some other method.

    class Ex1A {
      private Object f;
      Ex1A(Object o) {
        securityManagerCheck();
        this.f = o; }
      @Pre(@Raw(Ex1A))
      Object getF() { return this.f; }
    }
    class Ex1B extends Ex1A {
      Ex1B() {
        super(...);
        ... = this.getF();
      }
    }

Fig. 3. Motivations for Raw(CLASS) annotations

    public C() {
      ...
      securityManagerCheck(); // perform dynamic security checks
      SetInit;                // declare the object initialized up to C
      Global.register(this);  // the object is used with a method
    }                         // that only accepts Raw(C) parameters

Fig. 4. An Example with SetInit

Fig. 5 shows class ClassLoader with its policy specification. The policy
ensured by the current implementation of Sun is slightly weaker: it does not ensure
that the receiver is fully initialized when invoking resolveClass but simply
checks that the constructor of ClassLoader has been fully run. In this example,
we can see that the constructor has the annotations @Pre(@Raw), meaning
that the receiver may be completely uninitialized at the beginning, and
@Post(@Raw(ClassLoader)), meaning that, on normal return of the method,
at least one constructor for each parent class of ClassLoader and a constructor
of ClassLoader have been fully executed.

We define as default values the most precise type that may be used in each
context. This gives a safe-by-default policy and lowers the burden of annotating
a program.

- Fields, method parameters and return values are fully initialized objects
  (written @Init).

- Constructors take a receiver that is uninitialized at the beginning (@Pre(@Raw))
  and initialized up to the current class at the end (written @Post(@Raw(C))
  if in the class C).

- Other methods take a fully initialized receiver (@Pre(@Init)).

- Except for constructors, method receivers have the same type at the end
  as at the beginning of the method (written @Post(A) if the method has the
  annotation @Pre(A)).
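The four defaults listed above amount to a small lookup from the position of a
reference to its annotation. The sketch below makes that lookup explicit; the enum
Position and the method defaultFor are hypothetical names introduced for
illustration, and annotations are represented as plain strings rather than real Java
annotation types.

```java
// Sketch only: Position and defaultFor are hypothetical names.
class DefaultPolicySketch {
    enum Position { FIELD, PARAMETER, RETURN_VALUE, CONSTRUCTOR_RECEIVER, METHOD_RECEIVER }

    /** Default annotation for a position, following the four defaults of the policy. */
    static String defaultFor(Position p, String className) {
        switch (p) {
            case CONSTRUCTOR_RECEIVER:
                // Raw at entry, initialized up to the declaring class at exit.
                return "@Pre(@Raw) @Post(@Raw(" + className + "))";
            case METHOD_RECEIVER:
                // Fully initialized at entry, same type at exit.
                return "@Pre(@Init) @Post(@Init)";
            default:
                // Fields, parameters and return values.
                return "@Init";
        }
    }

    public static void main(String[] args) {
        for (Position p : Position.values()) {
            System.out.println(p + ": " + defaultFor(p, "ClassLoader"));
        }
    }
}
```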



    public abstract class ClassLoader {
      @Init private ClassLoader parent;
      @Pre(@Raw) @Post(@Raw(ClassLoader))
      protected ClassLoader() {
        SecurityManager sm = System.getSecurityManager();
        if (sm != null) { sm.checkCreateClassLoader(); }
        this.parent = ClassLoader.getSystemClassLoader();
      }
      @Pre(@Init) @Post(@Init)
      protected final native void resolveClass(@Init Class c);
    }

Fig. 5. Extract of the ClassLoader of Sun's JRE



If we remove the default annotations from Fig. 5, we obtain the original code
in Fig. 1. This shows that despite choosing the strictest (and safest) initialization
policy as default, the annotation burden can be kept low.



4.2 Checking the Initialization Policy.

We have chosen static type checking for at least two reasons. Static type checking
yields better performance (except in some rare cases), as the complexity of
static type checking is linear in the code size, whereas the complexity of dynamic
type checking is linear in the execution time. Static type checking also improves
the reliability of the code: if code passes type checking, then it is correct
with respect to its policy, whereas dynamic type checking only ensures the
correctness of a particular execution.

Reflection in Java makes it possible to retrieve code from the network or to
generate code dynamically. Thus, the whole code may not be available before actually
executing the program. Instead, code is made available class by class, and checked
by the BCV at linking time, before the first execution of each method. As the
whole program is not available, the type checking must be modular: there must
be enough information in a method to decide if this method is correct and, if an
incorrect method is found, there must exist a safe procedure to end the program
(usually throwing an exception), i.e. it must not be too late.

To have a modular type checker while keeping our security policy simple,
method parameters, respectively return values, need to be contra-variant,
respectively co-variant, i.e. the policy of the overriding methods needs to be at
least as general as the policy of the overridden method. Note that this is not
surprising: the same applies in the Java language (although Java imposes the
invariance of method parameters instead of the more general contra-variance),
and when a method call is found in a method, it makes it possible to rely on the
policy of the resolved method (as all the methods which may actually be called
cannot be known before the whole program is loaded).



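$$x, y, r \in \mathit{Var} \qquad f \in \mathit{Field} \qquad e \in \mathit{Exc} \qquad i \in \mathcal{L} = \mathbb{N}$$

$$
\begin{array}{rcl}
p \in \mathit{Prog} & ::= & \{\, \mathit{classes} \in \mathcal{P}(\mathit{Class}),\ \mathit{main} \in \mathit{Class}, \\
 & & \ \ \mathit{fields} \in \mathit{Field} \to \mathit{Type},\ \mathit{lookup} \in \mathit{Class} \rightharpoonup \mathit{Meth} \to \mathit{Meth} \,\} \\
c \in \mathit{Class} & ::= & \{\, \mathit{super} \in \mathit{Class}_{\bot},\ \mathit{methods} \in \mathcal{P}(\mathit{Meth}),\ \mathit{init} \in \mathit{Meth} \,\} \\
m \in \mathit{Meth} & ::= & \{\, \mathit{instrs} \in \mathit{Instr}\ \mathit{array},\ \mathit{handler} \in \mathcal{L} \to \mathit{Exc} \to \mathcal{L}_{\bot}, \\
 & & \ \ \mathit{pre} \in \mathit{Type},\ \mathit{post} \in \mathit{Type},\ \mathit{argtype} \in \mathit{Type},\ \mathit{rettype} \in \mathit{Type} \,\} \\
\tau \in \mathit{Type} & ::= & \mathit{Init} \mid \mathit{Raw}(c) \mid \mathit{Raw}^{\bot} \\
e \in \mathit{Expr} & ::= & \mathit{null} \mid x \mid e.f \\
\mathit{ins} \in \mathit{Instr} & ::= & x \leftarrow e \mid x.f \leftarrow y \mid x \leftarrow \mathit{new}\ c(y) \mid \mathit{if}\ (\star)\ \mathit{jmp} \\
 & \mid & \mathit{super}(y) \mid x \leftarrow r.m(y) \mid \mathit{return}\ x \mid \mathit{SetInit}
\end{array}
$$

Fig. 6. Language Syntax.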



5 Formal Study of the Type System 

The purpose of this work is to provide a type system that enforces at load time
an important security property. The semantic soundness of such a mechanism is
hence crucial for the global security of the Java platform. In this section, we
formally define the type system and prove its soundness with respect to an
operational semantics. All the results of this section have been machine-checked
with the Coq proof assistant.

Syntax. Our language is a simple language in-between Java source and Java
bytecode. Our goal was to have a language close enough to the bytecode in order
to easily obtain, from the specification, a naive implementation at the bytecode
level while keeping a language easy to reason with. It is based on the decompiled
language from Demange et al. [5] that provides a stack-less representation of
Java bytecode programs. Fig. 6 shows the syntax of the language. A program is
a record that holds a set of classes, a main class, a type annotation for each field
and a lookup operator. This operator is used to determine during a virtual call
the method (p.lookup c m) (if any) that is the first overriding version of a method
m in the ancestor classes of the class c. A class is composed of a super class (if
any), a set of methods and a special constructor method init. A method holds an
array of instructions and a handler function such that (m.handler i e) is the program
point (if any) in the method m where the control flows after an exception e has
been thrown at point i. Each method also holds four initialization types for the
initial value of the variable this (m.pre), its final value (m.post), the type of its
formal parameter (m.argtype) and the type of its return value (m.rettype). The
only expressions are the null constant, local variables and field reads. For this
analysis, arithmetic need not be taken into account. We only manipulate
objects. The instructions are the assignment to a local variable or to a field,
object creation (new), (non-deterministic) conditional jump, super constructor
call, virtual method call, return, and a special instruction that we introduce for
explicit object initialization: SetInit.

Notes. The Coq development can be downloaded at
http://www.irisa.fr/celtique/ext/rawtypes/. For the sake of simplicity, each
method has a unique formal parameter arg. The new instruction both allocates
the object and calls the constructor; at bytecode level this gives rise to two
separate instructions in the program (allocation and later constructor invocation),
but the intermediate representation generator [5] on which we rely is able to
recover such a construct.

Fig. 7. Semantic Domains.

Semantic Domains. Fig. 7 shows the concrete domain used to model the program
states. The state is composed of the current method m, the current program
point i in m (the index of the next instruction to be executed in m.instrs), a
function for local variables, a heap, a call stack and an exception flag. The heap
is a partial function which associates to a location an object $[c, c_{init}, o]$ with
c its type, $c_{init}$ its current initialization level and o a map from fields to values
(in the sequel o is sometimes confused with the object itself). An initialization
level $c_{init} \in \mathit{Class}$ means that a constructor of $c_{init}$ and of each of its
super-classes has been called on the object and has returned without abrupt
termination. The exception flag is used to handle exceptions: a state
$\langle \cdots \rangle_e$ with $e \in \mathit{Exc}$ is reached
after an exception e has been thrown. The execution then looks for a handler
in the current method and if necessary in the methods of the current call stack.
When equal to $\bot$, the flag is omitted (normal state). The call stack records the
program points of the pending calls together with their local environments and
the variable that will be assigned with the result of the call.

Initialization types. We distinguish three different kinds of initialization
types. Given a heap $\sigma$ we define a value typing judgment $\sigma \vdash v : \tau$ between values
and types with the following rules.

$$
\frac{}{\sigma \vdash \mathit{null} : \tau}
\qquad
\frac{}{\sigma \vdash l : \mathit{Raw}^{\bot}}
\qquad
\frac{\sigma(l) = [c_{dyn}, c_{init}, o] \qquad \forall c',\ c_{dyn} \preceq c' \wedge c \preceq c' \Rightarrow c_{init} \preceq c'}{\sigma \vdash l : \mathit{Raw}(c)}
\qquad
\frac{\sigma(l) = [c, c, o]}{\sigma \vdash l : \mathit{Init}}
$$

The relation $\preceq$ here denotes the reflexive transitive closure of the relation induced
by the super element of each class. $\mathit{Raw}^{\bot}$ denotes a reference to an object which
may be completely uninitialized (as at the very beginning of each constructor). Init
denotes a reference to an object which has been completely initialized. Between
those two "extreme" types, a value may be typed as Raw(c) if at least one
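constructor of c and of each parent of c has been executed on all objects that
may be referenced from this value. We can derive from this definition the
sub-typing relation $\mathit{Init} \sqsubseteq \mathit{Raw}(c) \sqsubseteq \mathit{Raw}(c') \sqsubseteq \mathit{Raw}^{\bot}$ if $c \preceq c'$. It satisfies the
important monotony property

$$\forall \sigma \in \mathbb{H},\ \forall v \in \mathbb{V},\ \forall \tau_1, \tau_2 \in \mathit{Type},\quad \tau_1 \sqsubseteq \tau_2 \wedge \sigma \vdash v : \tau_1 \Rightarrow \sigma \vdash v : \tau_2$$

$$
\frac{m.\mathit{instrs}[i] = x \leftarrow \mathit{new}\ c(y) \qquad x \neq \mathit{this} \qquad \mathit{Alloc}(\sigma, c, l, \sigma') \qquad \sigma' \vdash \rho(y) : c.\mathit{init}.\mathit{argtype}}
{\langle m, i, \rho, \sigma, cs \rangle \Rightarrow \langle c.\mathit{init}, 0, [\cdot \mapsto \mathit{null}][\mathit{this} \mapsto l][\mathit{arg} \mapsto \rho(y)], \sigma', (m, i, \rho, x) :: cs \rangle}
$$

$$
\frac{m.\mathit{instrs}[i] = \mathit{SetInit} \qquad m = c.\mathit{init} \qquad \rho(\mathit{this}) = l \qquad \mathit{SetInit}(\sigma, c, l, \sigma')}
{\langle m, i, \rho, \sigma, cs \rangle \Rightarrow \langle m, i+1, \rho, \sigma', cs \rangle}
$$

$$
\frac{\begin{array}{c}
m.\mathit{instrs}[i] = \mathit{return}\ x \qquad \rho(\mathit{this}) = l \qquad (\forall c,\ m \neq c.\mathit{init}) \Rightarrow \sigma = \sigma' \\
(\forall c,\ m = c.\mathit{init} \Rightarrow \mathit{SetInit}(\sigma, c, l, \sigma') \wedge x = \mathit{this})
\end{array}}
{\langle m, i, \rho, \sigma, (m', i', \rho', r) :: cs \rangle \Rightarrow \langle m', i'+1, \rho'[r \mapsto \rho(x)], \sigma', cs \rangle}
$$

Fig. 8. Operational Semantics (excerpt).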
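The value typing rules and the sub-typing relation of this section can be made
concrete with a small model. The sketch below is an illustration under stated
assumptions, not the formalized development: InitType, HeapObject, below, subtype
and hasType are hypothetical names, Java's Class objects stand in for the classes of
the formal language, and a null initialization tag encodes the bottom level.

```java
// Sketch only: InitType, HeapObject and the method names are hypothetical.
class InitTypesSketch {
    /** Init, Raw(c), or the fully raw type (upTo is only used for RAW_UP_TO). */
    static final class InitType {
        enum Kind { INIT, RAW_UP_TO, RAW }
        final Kind kind;
        final Class<?> upTo;
        private InitType(Kind kind, Class<?> upTo) { this.kind = kind; this.upTo = upTo; }
        static final InitType INIT = new InitType(Kind.INIT, null);
        static final InitType RAW = new InitType(Kind.RAW, null);
        static InitType raw(Class<?> c) { return new InitType(Kind.RAW_UP_TO, c); }
    }

    /** True if c is cPrime or one of its (transitive) subclasses. */
    static boolean below(Class<?> c, Class<?> cPrime) { return cPrime.isAssignableFrom(c); }

    /** Sub-typing: Init, then Raw(c), then Raw(c') when c is below c', then fully raw. */
    static boolean subtype(InitType a, InitType b) {
        if (b.kind == InitType.Kind.RAW || a.kind == InitType.Kind.INIT) return true;
        return a.kind == InitType.Kind.RAW_UP_TO && b.kind == InitType.Kind.RAW_UP_TO
                && below(a.upTo, b.upTo);
    }

    /** Heap object reduced to [dyn, init]; init == null encodes the bottom level. */
    static final class HeapObject {
        final Class<?> dyn, init;
        HeapObject(Class<?> dyn, Class<?> init) { this.dyn = dyn; this.init = init; }
    }

    /** Value typing for a non-null reference to o. */
    static boolean hasType(HeapObject o, InitType t) {
        switch (t.kind) {
            case RAW: return true;
            case INIT: return o.init == o.dyn;
            default:
                // Every ancestor cp of dyn that is also above c must be above init.
                for (Class<?> cp = o.dyn; cp != null; cp = cp.getSuperclass()) {
                    if (below(t.upTo, cp) && (o.init == null || !below(o.init, cp))) return false;
                }
                return true;
        }
    }

    static class A {}
    static class B extends A {}

    public static void main(String[] args) {
        HeapObject o = new HeapObject(B.class, A.class);    // a B built only up to A
        System.out.println(hasType(o, InitType.raw(A.class)));
        System.out.println(hasType(o, InitType.raw(B.class)));
        System.out.println(hasType(o, InitType.INIT));
        System.out.println(subtype(InitType.raw(B.class), InitType.raw(A.class)));
    }
}
```

In this model an object of dynamic class B whose tag is A has type Raw(A) but neither
Raw(B) nor Init, which mirrors the situation of the receiver inside the constructor
of Ex1B in Fig. 3.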

Note that the sub-typing judgment is disconnected from the static type of the object.
In a first approach, we could expect to manipulate a pair $(c, \tau)$ with c the static
type of an object and $\tau$ its initialization type and consider equivalent both types
(c, Raw(c)) and (c, Init). Such a choice would however deeply impact the
standard dynamic mechanisms of a JVM: each dynamic cast from A to B (or a
virtual call on a receiver) would require checking that an object has an
initialization level set not only up to A but also up to B.



Operational Semantics. We define the operational semantics of our language
as a small-step transition relation over program states. A fixed program p is
implicit in the rest of this section. Fig. 8 presents some selected rules for this
relation. The rule for the new instruction includes both the allocation and the call
to the constructor. We use the auxiliary predicate $\mathit{Alloc}(\sigma, c, l, \sigma')$ which allocates
a fresh location l in heap $\sigma$ with type c, initialization level equal to $\bot$ and all
fields set to null. The constraint $\sigma' \vdash \rho(y) : c.\mathit{init}.\mathit{argtype}$ explicitly asks
the caller of the constructor to give a correct argument with respect to the
policy of the constructor. Each call rule of the semantics has similar constraints.
The execution is hence stuck when an attempt is made to call a method with
badly typed parameters. The SetInit instruction updates the initialization level
of the object in this. It relies on the predicate $\mathit{SetInit}(\sigma, c, l, \sigma')$ which specifies
that $\sigma'$ is a copy of $\sigma$ where the object at location l now has the initialization
tag set to c if the previous initialization was c.super. It forces the current object
(this) to be considered as initialized up to the current class (i.e. as if the
constructor of the current class had returned, but not necessarily the constructors
of the subsequent classes). This may be used in the constructor, once all fields
that need to be initialized have been initialized and if some method requiring
a non-raw object needs to be called. Note that this instruction is really
sensitive: using this instruction too early in a constructor may break the security of
the application. The return instruction uses the same predicate when invoked
in a constructor. For convenience we require each constructor to end with a
return this instruction.

Expression typing

$$
\frac{}{L \vdash e.f : (p.\mathit{fields}\ f)}
\qquad
\frac{}{L \vdash x : L(x)}
\qquad
\frac{}{L \vdash \mathit{null} : \mathit{Init}}
$$

Instruction typing

$$
\frac{L \vdash e : \tau \qquad x \neq \mathit{this}}{m \vdash x \leftarrow e : L \to L[x \mapsto \tau]}
\qquad
\frac{L(y) \sqsubseteq (p.\mathit{fields}\ f)}{m \vdash x.f \leftarrow y : L \to L}
\qquad
\frac{}{m \vdash \mathit{if}\ \star\ \mathit{jmp} : L \to L}
$$

$$
\frac{L(\mathit{this}) \sqsubseteq m.\mathit{post} \qquad L(x) \sqsubseteq m.\mathit{rettype} \qquad (\forall c,\ m = c.\mathit{init} \Rightarrow L(\mathit{this}) \sqsubseteq \mathit{Raw}(c.\mathit{super}))}
{m \vdash \mathit{return}\ x : L \to L}
$$

$$
\frac{L(y) \sqsubseteq c.\mathit{init}.\mathit{argtype}}{m \vdash x \leftarrow \mathit{new}\ c(y) : L \to L[x \mapsto \mathit{Init}]}
\qquad
\frac{c' = c.\mathit{super} \qquad L(y) \sqsubseteq c'.\mathit{init}.\mathit{argtype}}{c.\mathit{init} \vdash \mathit{super}(y) : L \to L[\mathit{this} \mapsto \mathit{Raw}(c')]}
$$

$$
\frac{L(r) \sqsubseteq m'.\mathit{pre} \qquad L(y) \sqsubseteq m'.\mathit{argtype}}{m \vdash x \leftarrow r.m'(y) : L \to L[r \mapsto m'.\mathit{post}][x \mapsto m'.\mathit{rettype}]}
\qquad
\frac{L(\mathit{this}) \sqsubseteq \mathit{Raw}(c.\mathit{super})}{c.\mathit{init} \vdash \mathit{SetInit} : L \to L[\mathit{this} \mapsto \mathit{Raw}(c)]}
$$

Fig. 9. Flow sensitive type system
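The SetInit predicate described in this paragraph can be read as a partial function
on heaps. The following sketch illustrates that reading under simplifying
assumptions: Obj, setInit and the map-based heap are hypothetical, objects are
reduced to their dynamic class and initialization tag, and a null tag encodes the
bottom level given to freshly allocated objects.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

// Sketch only: Obj, setInit and the heap encoding are hypothetical.
class SetInitSketch {
    /** Heap object reduced to its dynamic class and initialization tag (null = bottom). */
    static final class Obj {
        final Class<?> dyn, init;
        Obj(Class<?> dyn, Class<?> init) { this.dyn = dyn; this.init = init; }
    }

    /**
     * Returns a copy of heap in which the tag of location l is c, provided the
     * previous tag of l was the super-class of c; otherwise returns Optional.empty(),
     * which models a stuck execution.
     */
    static Optional<Map<Integer, Obj>> setInit(Map<Integer, Obj> heap, Class<?> c, int l) {
        Obj o = heap.get(l);
        if (o == null || !Objects.equals(o.init, c.getSuperclass())) return Optional.empty();
        Map<Integer, Obj> copy = new HashMap<>(heap);
        copy.put(l, new Obj(o.dyn, c));
        return Optional.of(copy);
    }

    static class A {}
    static class B extends A {}

    public static void main(String[] args) {
        Map<Integer, Obj> h0 = new HashMap<>();
        h0.put(1, new Obj(B.class, null));                  // freshly allocated object
        Map<Integer, Obj> h1 = setInit(h0, Object.class, 1).get();
        // Skipping the level of A is refused: the tag is Object, not A.
        System.out.println(setInit(h1, B.class, 1).isPresent());
        Map<Integer, Obj> h2 = setInit(h1, A.class, 1).get();
        System.out.println(setInit(h2, B.class, 1).get().get(1).init == B.class);
    }
}
```

The refusal to jump from Object directly to B reflects the requirement that no level
of constructor can be bypassed.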

Typing judgment. Each instruction ins of a method m is attached a typing
rule (given in Fig. 9) $m \vdash \mathit{ins} : L \to L'$ that constrains the types of variables
before (L) and after (L') the execution of ins.

Definition 1 (Well-typed Method). A method m is well-typed if there exist
flow sensitive variable types $L \in \mathcal{L} \to \mathit{Var} \to \mathit{Type}$ such that

- $m.\mathit{pre} \sqsubseteq L(0, \mathit{this})$ and $m.\mathit{argtype} \sqsubseteq L(0, \mathit{arg})$,

- for every instruction ins at point i in m and every successor j of i, there exists
  a map of variable types $L' \in \mathit{Var} \to \mathit{Type}$ such that $L' \sqsubseteq L(j)$ and the typing
  judgment $m \vdash \mathit{ins} : L(i) \to L'$ holds. If i is in the handler j of an exception
  e (i.e. $m.\mathit{handler}\ i\ e = j$) then $L(i) \sqsubseteq L(j)$.

The typability of a method can be decided by turning the set of typing
rules into a standard dataflow problem. The approach is standard and not
formalized here.
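To give an idea of what such a dataflow formulation can look like, the sketch below
runs a worklist fixpoint over a deliberately reduced setting. It is not the paper's
checker: it tracks only the type of this inside a constructor of one fixed class C,
uses a hypothetical four-point chain (INIT, RAW_C, RAW_SUPER, RAW) in place of the
full type lattice, assumes that SetInit yields Raw(C), and omits the premises of the
return rule. All names (Ty, Op, Ins, infer) are introduced for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch only: a hypothetical chain and instruction set, tracking `this` alone.
class ThisFlowSketch {
    /** INIT, RAW_C, RAW_SUPER, RAW in increasing order, for one fixed class C. */
    enum Ty {
        INIT, RAW_C, RAW_SUPER, RAW;
        boolean leq(Ty o) { return ordinal() <= o.ordinal(); }
        Ty join(Ty o) { return ordinal() >= o.ordinal() ? this : o; }
    }

    enum Op { SUPER, SET_INIT, CALL_NEEDS_RAW_C, JUMP, RETURN }

    static final class Ins {
        final Op op;
        final int target;                         // only used by JUMP
        Ins(Op op, int target) { this.op = op; this.target = target; }
        Ins(Op op) { this(op, -1); }
    }

    /** Type of `this` before each point, or null if the premise of some rule fails. */
    static Ty[] infer(Ins[] code) {
        Ty[] in = new Ty[code.length];            // null = point not reached yet
        in[0] = Ty.RAW;                           // default @Pre(@Raw) of constructors
        Deque<Integer> work = new ArrayDeque<>();
        work.add(0);
        while (!work.isEmpty()) {
            int i = work.poll();
            Ty before = in[i], after = before;
            int[] succs = { i + 1 };
            switch (code[i].op) {
                case SUPER: after = Ty.RAW_SUPER; break;
                case SET_INIT:
                    if (!before.leq(Ty.RAW_SUPER)) return null;
                    after = Ty.RAW_C;
                    break;
                case CALL_NEEDS_RAW_C:
                    if (!before.leq(Ty.RAW_C)) return null;
                    break;
                case JUMP: succs = new int[] { i + 1, code[i].target }; break;
                case RETURN: succs = new int[0]; break;
            }
            for (int j : succs) {                 // propagate the join to each successor
                Ty merged = in[j] == null ? after : in[j].join(after);
                if (merged != in[j]) { in[j] = merged; work.add(j); }
            }
        }
        return in;
    }

    public static void main(String[] args) {
        Ins[] ok = { new Ins(Op.SUPER), new Ins(Op.SET_INIT),
                     new Ins(Op.CALL_NEEDS_RAW_C), new Ins(Op.RETURN) };
        Ins[] bad = { new Ins(Op.SUPER), new Ins(Op.JUMP, 3), new Ins(Op.SET_INIT),
                      new Ins(Op.CALL_NEEDS_RAW_C), new Ins(Op.RETURN) };
        System.out.println(infer(ok) != null);    // straight-line constructor
        System.out.println(infer(bad) != null);   // a path skips SetInit before the call
    }
}
```

Because types only grow along the chain during the iteration, a premise that fails
at some step would also fail at the fixpoint, so the sketch can reject as soon as a
check fails. The sketch assumes well-formed code in which every instruction except
return has a next instruction.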

Definition 2 (Well-typed Program). A program p is well-typed if all its
methods are well-typed and the following constraints hold:

1. for every method m that is overridden by a method m' (i.e. there exists c
   such that (p.lookup c m = m')),

$$m.\mathit{pre} \sqsubseteq m'.\mathit{pre} \ \wedge\ m.\mathit{argtype} \sqsubseteq m'.\mathit{argtype} \ \wedge\ m.\mathit{post} \sqsupseteq m'.\mathit{post} \ \wedge\ m.\mathit{rettype} \sqsupseteq m'.\mathit{rettype}$$

2. in each method, every first point, jump target and handler point contains an
   instruction and every instruction (except return) has a next instruction,

3. the default constructor c.init of each class c is unique.

In this definition only point 1 is really specific to the current type system. The
other points are necessary to establish the progress theorem of the next section.
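Point 1 of Definition 2 is a four-way variance check, contra-variant on the receiver
precondition and the argument, co-variant on the receiver postcondition and the
return value. A minimal sketch of that check follows; Policy, overrideOk and the toy
chain Level are hypothetical names, and the sub-typing test is passed in as a
parameter so that the check itself does not depend on how types are represented.

```java
import java.util.function.BiPredicate;

// Sketch only: Policy, overrideOk and the Level chain are hypothetical.
class OverrideCheckSketch {
    /** The four initialization types attached to a method. */
    static final class Policy<T> {
        final T pre, post, argtype, rettype;
        Policy(T pre, T post, T argtype, T rettype) {
            this.pre = pre; this.post = post; this.argtype = argtype; this.rettype = rettype;
        }
    }

    /** Checks that mPrime may override m, given the sub-typing test leq. */
    static <T> boolean overrideOk(Policy<T> m, Policy<T> mPrime, BiPredicate<T, T> leq) {
        return leq.test(m.pre, mPrime.pre)              // receiver at entry: contra-variant
            && leq.test(m.argtype, mPrime.argtype)      // argument: contra-variant
            && leq.test(mPrime.post, m.post)            // receiver at exit: co-variant
            && leq.test(mPrime.rettype, m.rettype);     // return value: co-variant
    }

    /** Toy chain standing in for Init, Raw(C) and the fully raw type. */
    enum Level { INIT, RAW_C, RAW }

    public static void main(String[] args) {
        BiPredicate<Level, Level> leq = (a, b) -> a.ordinal() <= b.ordinal();
        Policy<Level> base = new Policy<>(Level.INIT, Level.INIT, Level.INIT, Level.INIT);
        // Accepting a less initialized receiver while keeping the same guarantees is fine.
        Policy<Level> laxReceiver = new Policy<>(Level.RAW_C, Level.INIT, Level.INIT, Level.INIT);
        // Returning a possibly raw value where callers expect Init is not.
        Policy<Level> rawResult = new Policy<>(Level.INIT, Level.INIT, Level.INIT, Level.RAW);
        System.out.println(overrideOk(base, laxReceiver, leq));
        System.out.println(overrideOk(base, rawResult, leq));
    }
}
```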
Type soundness. We rely on an auxiliary notion of well-formed states that
captures the semantic constraints enforced by the type system. A state
$\langle m, i, \rho, \sigma, cs \rangle$ is well-formed (wf) if there exists a type annotation
$L_p \in (\mathit{Meth} \times \mathcal{L}) \to (\mathit{Var} \to \mathit{Type})$ such that

$$
\begin{array}{ll}
\forall l \in \mathbb{L},\ \forall o \in \mathbb{O},\ \sigma(l) = o \Rightarrow \sigma \vdash o(f) : (p.\mathit{fields}\ f) & \text{(wf. heap)} \\
\forall x \in \mathit{Var},\ \sigma \vdash \rho(x) : L_p[m, i](x) & \text{(wf. local variables)} \\
\forall (m', i', \rho', r) \in cs,\ \forall x,\ \sigma \vdash \rho'(x) : L_p[m', i'](x) & \text{(wf. call stack)}
\end{array}
$$

Given a well-typed program p we then establish two key theorems. First,
any valid transition from a well-formed state leads to another well-formed state
(preservation) and second, from every well-formed state there exists at least one
transition (progress). As a consequence we can establish that starting from an
initial state (which is always well-formed) the execution is never stuck, except
on a final configuration. This ensures that all initialization constraints given in the
operational semantics are satisfied without requiring any dynamic verification.

Limitations. The proposed language has some limitations compared to the
Java (bytecode) language. Static fields and arithmetic have not been introduced
but are handled by our implementation and do not add particular difficulties.
Arrays have not been introduced in the language either. Our implementation
conservatively handles arrays by allowing only writes of Init references in arrays.
Although this approach seems correct, it has not been proved and it is not
flexible enough (cf. Section 6). Multi-threading has also been left out of the current
formalization but we conjecture the soundness result still holds with respect to
the Java Memory Model because of the flow insensitive abstraction made on the
heap. As for the BCV, native methods may break the type system. It is their
responsibility to respect the policy expressed in the program.

6 A Case Study: Sun's JRE 

In order to show that our type system can verify legacy code with only a
few annotations, we implemented a standalone prototype, handling the full Java
bytecode, and we tested all classes of packages java.lang, java.security and
javax.security of the JRE 1.6.0_20.

348 classes out of 381 were proven safe w.r.t. the default policy without any
modification. By either specifying the actual policy when the default policy was
too strict, or by adding cast instructions (see below) when the type system was
not precise enough, we were able to verify 377 classes, that is to say 99% of
classes. We discuss below the 4 remaining classes that are not yet proven correct
by our analysis. The modifications represent only 55 source lines of code out
of 131,486 for the three packages studied. Moreover most code modifications
are to express the actual initialization policy, which means existing code can be
proven safe. Only 45 methods out of 3,859 (1.1%) and 2 fields out of 1,524 were
annotated. Last but not least, the execution of the type checker takes less than
20 seconds for the packages studied.

[Figure 10 legend: SetInit instructions; annotations on fields; annotations on
methods (on receiver, on arguments, on return value).]

Fig. 10. Distribution of the 47 annotations and 6 instructions added to successfully
type the three packages of the JRE.

Adapting the security policy. Fig. 10 details the annotations and the SetInit
instructions added to specify the security policy. In the runtime library, a
usual pattern consists in calling methods that initialize fields during the
construction of the object. In that case, a simple annotation
@Pre(@Raw(super(C))) on methods of class C is necessary. These cases represent
the majority of the 37 annotations on method receivers. 6 annotations on method
arguments are used, notably for some methods of java.lang.SecurityManager which
check permissions on an object during its initialization. The instruction
SetInit is used when a constructor initializes all the fields of the receiver
and then calls methods on the receiver that are not part of the initialization.
In that case the called methods need at least a Raw(C) level of initialization,
and the SetInit instruction expresses that the constructor has finished the
minimum initialization of the receiver. Only 6 SetInit instructions are
necessary.
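
The pattern described above can be sketched as follows. The classes Base and
Buffer are hypothetical and are not taken from the JRE. The annotation and the
SetInit instruction appear only as comments, because the notation
@Pre(@Raw(super(C))) is not valid Java annotation syntax and the prototype's
concrete surface syntax is not assumed here.

```java
class Base {
    protected int capacity;

    Base() {
        capacity = 16;
    }
}

class Buffer extends Base {
    private int[] data;
    private int size;

    Buffer() {
        super();     // from here on, the receiver is initialized up to Base
        allocate();  // helper called on a receiver that is not fully initialized
        // A SetInit instruction would be placed here: every field of Buffer is
        // now assigned, so the receiver has reached the level Raw(Buffer).
        describe();  // not part of the initialization; needs at least Raw(Buffer)
    }

    // Annotation in the paper's notation (kept as a comment):
    // @Pre(@Raw(super(Buffer)))
    private void allocate() {
        data = new int[capacity]; // relies only on a field set by Base()
        size = 0;
    }

    String describe() {
        return "Buffer[" + size + "/" + data.length + "]";
    }
}
```

Without the receiver annotation on allocate(), the default policy would require
a fully initialized receiver and the call inside the constructor would not
type-check.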

Cast instructions. Such a static and modular type checking introduces some
necessary loss of precision, which cannot be completely avoided because of
computability issues. To be able to use our type system on legacy code without
deep modifications, we introduce two dynamic cast operators: (Init) and
(Raw). The instruction y = (Init)x; dynamically checks that x points
to a fully initialized object: if the object is fully initialized, then this is
a simple assignment to y, otherwise it throws an exception. As explained in
Section 3, the invariant needed is often weaker and the correctness of a method
may only need a Raw(C) reference. The instruction y = (Raw(C))x; dynamically
checks that x points to an object which is initialized up to the constructor of
class C.
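
The intended behaviour of the two casts can be modelled by the toy class below.
It is only a sketch under assumptions: the class InitTracker, its method names
and the choice of IllegalStateException are hypothetical, and a real
implementation would track initialization inside the virtual machine rather
than in a map.

```java
import java.util.IdentityHashMap;
import java.util.Map;

/** Toy model of the dynamic checks performed by (Init) and (Raw(C)). */
class InitTracker {
    // For each object, the most specific class whose constructor has completed.
    // An object with no entry has not completed any constructor.
    private static final Map<Object, Class<?>> LEVEL = new IdentityHashMap<>();

    /** Records that the constructor of class c has finished on o. */
    static void constructorDone(Object o, Class<?> c) {
        LEVEL.put(o, c);
    }

    /** Models y = (Raw(C))x: succeeds if x is initialized up to the constructor of c. */
    static <T> T castRaw(T x, Class<?> c) {
        Class<?> done = LEVEL.get(x);
        // Constructors complete from the root class downwards, so x has reached
        // level c exactly when c is `done` itself or one of its superclasses.
        if (done == null || !c.isAssignableFrom(done)) {
            throw new IllegalStateException("not initialized up to " + c.getSimpleName());
        }
        return x;
    }

    /** Models y = (Init)x: succeeds if the constructor of x's dynamic class has finished. */
    static <T> T castInit(T x) {
        return castRaw(x, x.getClass());
    }
}
```

In this model (Init) is the special case of (Raw(C)) where C is the dynamic
class of the object, which mirrors the fact that Raw(C) is the weaker
requirement.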

Only 4 cast instructions are necessary. They are needed in two particular
cases. The first case is when a field must be annotated: annotations were
necessary on only two fields, and they entail the use of 3 (Init) cast
instructions. The second case is on a receiver in a finalize() method that
checks that some fields are initialized, thereby checking that the object was
Raw(C), but the type system could not infer this information. The latter case
requires the single (Raw(C)) instruction added.

Remaining classes. Finally, only 4 classes are not well-typed after the previous
modifications. Indeed, the compiler generates some code to compile inner classes
and part of this code needs annotations in 3 classes. These cases could be handled
by making significant changes to the code, by adding new annotations dedicated
to inner classes or by annotating the bytecode directly. The one remaining class
is not typable because of the limited precision of our analysis on arrays: one can
only store @Init values in arrays. To check this last class, our type system
needs to be extended to handle arrays more precisely, but this is left for future
work.

Special case of finalize methods. As explained previously, finalize() methods
may be invoked on a completely uninitialized receiver. Therefore, we study the
case of finalize() methods in the packages java.* and javax.*. In the classes
of those packages there are 28 finalize() methods and only 12 are
well-typed with our default annotation values. These are either empty or do not
use their receiver at all. For the remaining 16 classes, the necessary
modifications are either the use of cast instructions when the code's logic
guarantees the success of the cast, or the addition of @Pre(@Raw) annotations on
methods called on the receiver. In that case, it is important to verify that the
code of any called method is defensive enough. Therefore, the type system forced
us to pay attention to the cases that could lead to security breaches or crashes
at run time for finalize methods. After a meticulous check of the code we added
the necessary annotations and cast instructions, which allowed us to verify the
28 classes.
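
The kind of defensive code expected from a method called in finalize() can be
sketched as follows. The class Resource and its members are hypothetical. The
helper release() is the sort of method that would carry a @Pre(@Raw)
annotation, written here as a comment only, since it must tolerate a receiver
whose constructor never completed.

```java
class Resource {
    private boolean initialized; // set last; stays false if the constructor throws
    private StringBuilder log;

    Resource(boolean authorized) {
        if (!authorized) {
            throw new SecurityException("not authorized");
        }
        log = new StringBuilder("open");
        initialized = true;
    }

    // @Pre(@Raw): may be called on a receiver that is not initialized at all.
    String release() {
        if (!initialized) {
            return "skipped"; // nothing was set up, so there is nothing to release
        }
        log.append(",closed");
        return log.toString();
    }

    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() {
        // The garbage collector may call this even if the constructor threw,
        // so only methods that are defensive enough are invoked here.
        release();
    }
}
```

If release() dereferenced log without first testing the flag, a finalization
following a failed construction would raise a NullPointerException, which is
the kind of run-time failure the type system points out.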

7 Conclusion and Future Work 

We have proposed a solution to enforce secure initialization of objects in
Java. The solution is composed of a modular type system, which makes it possible
to manage uninitialized objects safely when necessary, and of a modular type
checker, which can be integrated into the BCV to statically check a program at
load time. The type system has been formalized and proved sound, and the
type-checker prototype has been experimentally validated on more than 300
classes of the Java runtime library.

The experimental results show that our default annotations minimize the
user intervention needed to type a program and allow the user to focus on the
few classes where the security policy needs to be stated explicitly. Adapting
the security policy in critical cases makes it easy to prevent security breaches
and can, in addition, ensure some finer initialization properties whose violation
could lead the program to crash. On the one hand, the results show that such a
static and modular type checking can efficiently prove the absence of
bugs. On the other hand, rare cases require the introduction of dynamic
features, and the analysis needs to be extended to handle arrays more precisely.
With such an extension, the checker would be able to prove more classes correct,
but this is left for future work.

On the formalization side, an obvious extension is to establish the soundness
of the approach in the presence of multi-threading. We conjecture that the
soundness result still holds with respect to the Java Memory Model because of
the flow-insensitive abstraction made on the heap.

The prototype and the Coq formalization and proofs can be downloaded from
http://www.irisa.fr/celtique/ext/rawtypes/